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Abstract. Several widely-used textbooks lead the reader to believe 
that solving a linear system of equations Ax — b by multiplying b 
by a computed inverse inv(A) is inaccurate. Virtually all other text- 
books on numerical analysis and numerical linear algebra advise against 
using computed inverses without stating whether this is accurate or 
not. In fact, under reasonable assumptions on how the inverse is com- 
puted, x=inv(A)*b is as accurate as the solution computed by the best 
backward-stable solvers. This fact is not new, but obviously obscure. We 
review the literature on the accuracy of this computation and present a 
self-contained numerical analysis of it. 



1. Introduction 

Can you accurately compute the solution to a linear equation Ax = b by 
first computing an approximation V to A' 1 and then multiplying b by V 
(x=inv(A)*b in Matlab)? 

Unfortunately, most of the literature provides a misleading answer to this 
question. Many textbooks, including recent and widely-used ones, mislead 
the reader to think that x=inv(A)*b is less accurate than x=A\b, which 
computes the LU factorization of A with partial pivoting and then solves 
for x using the factors [6, p. 31], [10, p. 53], [7, p. 50], [11, p. 77], [2, p. 166], 
[13, pp. 184, 235, and 246]. Other textbooks warn against using a computed 
inverse for performance reasons without saying anything about accuracy. If 
you still dare use x=inv(A)*b in Matlab code, Matlab's analyzer issues a 
wrong and misleading warning [9] . 

As far as we can tell, only two sources in the literature present a correct 
analysis of this question. One is almost 50 years old [16, pp. 128-129], 
and is therefore hard to obtain and somewhat hard to read. The other is 
recent, but relegates this analysis to the solution of an exercise, rather than 
including it in the 27-page chapter on the matrix inverse [8, p. 559; see 
also p. 260]; even though the analysis there shows that x=inv(A)*b is as 
accurate as x=A\b, the text ends by stating that "multiplying by an explicit 
inverse is simply not a good way to solve a linear system" . The reader must 
pay careful attention to the analysis if he or she is to answer our question 
correctly. 

Our aim in this article is to clarify to researchers (and perhaps also to 
educators and students) the numerical properties of a solution to Ax = b 
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that is obtained by multiplying by a computed inverse. We do not present 
new results; we present results that are known, but not as much as they 
should be. 

Computing the inverse requires more arithmetic operations than comput- 
ing an LU factorization. We do not address the question of computational 
efficiency, but we do note that there is evidence that using the inverse is 
sometimes preferable from the performance perspective [3]. 

It also appears that explicit inverses are sometimes used when the inverse 
must be applied in hardware, as in some MIMO radios [5, 15]. The numeri- 
cal analysis in the literature and in this paper does not apply as-is to these 
computations, because hardware implementations typically use fixed-point 
arithmetic rather than floating point. Still, the analysis that we present 
here provides guiding principles to all implementations (e.g., to solve for the 
rows of the inverse using a backward-stable solver), and it may also pro- 
vide a template for an analysis of fixed-point implementations or alternative 
inversion algorithms. 

The rest of this paper is organized as follows. Section 2 presents the naive 
numerical analysis that probably led many authors to claim that x=inv(A) *b 
is inaccurate; the analysis is correct, but the error bound that it yields is 
too loose. Section 3 presents a much tighter analysis, due to Wilkinson; 
Higham later showed that this bound holds even in the componentwise sense. 
Section 4 explains another aspect of computed inverses that is not widely 
appreciated: that they are typically good for applying either from the left 
or from the right, but not both. Even when x=inv(A)*b is accurate, x is 
usually not backward stable; Section 5 discusses conditions under which x 
is also backward stable. To help the reader fully understand all of these 
results, Section 6 demonstrates them using simple numerical experiments. 
We present concluding remarks in Section 7. 

2. A Loose Bound 

Why did the inverse acquire its bad reputation? Good inversion methods 
produce a computed inverse V that is, at best, conditionally accurate, 



We cannot hope for an unconditional bound of 0(e mac hme) on the relative 
forward error. Some inversion methods guarantee conditional accuracy (for 
example, computing the inverse column by column using a backward stable 
linear solver). In particular, lapack's xGETRI satisfies (2.1), and also a 
componentwise conditional bound [8, p. 268]. That is, each entry in the 
computed inverse that xGETRI produces is conditionally accurate. It appears 
that Matlab's inv function also satisfies (2.1). 

Let's try to use (2.1) to obtain a bound on the forward error \\xy — 
x\\. Multiplying b by V in floating point produces xy that satisfies xy = 





V -A- 1 
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(V + A) b for some A with ||A||/||V|| = 0(e machinc ). Denoting T = V-A' 1 , 
we have 

x v = (V + A)b 

= [A~ 1 +T + A)b 
= (A' 1 + r + A) Ax 
= x + TAx + AAx , 

so 

\\ x v — x \\ 5; ||r|| ||x|| + ||A|| \\A\\ \\x\\ 

< 0(K(vl)e machinc ) \\A\\ \\x\\ + 0(e machine ) ||V|| \\A\\ \\x\\ 

(2.2) = 0(Av 2 (4)e mac hine) \\x\\ + O(emachine) \\V\\ \\A\\ \\x\\ . 

Unless A is so ill conditioned that the left-hand side of (2.1) is larger than 
1 (any constant would do), ||V|| = 0(||vl -1 ||). Therefore, 

(2.3) 1 1 a;y - x\\ < 0(K 2 (4)e machine ) . 

In contrast, solving Ax = b using a backward stable solver such as one 
based on the QR factorization (or on an LU factorization with partial piv- 
oting provided there is no growth) yields xtmckward-stabie f° r which 

(2.4) 1 1 ^backward-stable x \\ ^ 0('^( J 4)e mac l 1 ine) ||^|| • 

The bound (2.3) is correct, but it is just an upper bound on the error, 
and it turns out that it is loose by a factor of k(A). It appears that this 
easy-to-derive but loose bound gave the matrix inverse its bad reputation. 
In fact, xy satisfies an accuracy bound just like (2.4). 

3. Tightening the Bound 

The bound (2.3) is loose because of a single term, ||r|| \\A\\, which we used 
to bound the norm of TA. The other term in the bound, 0(K(A)e raac \ l i ne ) \\x\\, 
is tight. 

The key insight is that rows of F = V — A -1 tend to lie mostly in the 
directions of left singular vectors of A that are associated with small singular 
values. The smaller the singular value of A, the stronger the influence of 
the corresponding singular vector (or singular subspace) on the rows of T. 
Therefore, the norm of the product of T and A is much smaller than the 
product of the norms; A shrinks the strong directions of the error matrix T. 
This explains why the norm of TA is small. This relationship between the 
singular vectors of A and T depends on a backward stability criterion on V, 
which we define and analyze below. 

Suppose that we use a backward stable solver to compute the rows of V 
one by one by solving V{A = ej where e, is row i of I. Each computed row 
satisfies 

Vi (A + Ei) = a 
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with [|Sj||/[|yl|| = 0(e mac hine)- Rearranging the equation, we obtain 



VA — I 



so ||FA-J|| = 0(\\V\\ ||^4||e mac hine) = 0(/«(^)e mac hmc)- For a componentwise 
version of this bound and related bounds for other methods of computing 
V, see [8, section 14.3]. 

This is the key to the conditional accuracy of xy. Since TA = (V — 
A~ 1 )A = VA — I, the norm of TA is 0(ft(-A)e mac hi n e)- We therefore have 
the following theorem. 

Theorem 1. Let Ax = b be a linear system with a coefficient matrix that 
satisfies K>(A)e mac hine = 0(1)- Assume that V is an approximate inverse 
of A that satisfies \\VA — I\\ = 0(n(A)e mac hine) ■ Then the floating-point 
product xy of V and b satisfies 



\Xy 



X 



X 



O (k ( ^4) € machine) 



The essence of this analysis appears in Wilkinson's 1963 monograph [16, 
pp. 128-129]. Wilkinson did not account for the rounding errors in the 
multiplication Vb, which are not asymptotically significant, but otherwise 
his analysis is complete and correct. 



4. Left and Right Inverses 

In this article, we multiply b by the inverse from the left to solve Ax = b. 
This implies that the approximate inverse V should be a good left inverse. 
Indeed, we have seen that a V with a small left residual VA — I guarantees a 
conditionally accurate solution xy. Whether V is also a good right inverse, 
in the sense that AV — I is small, is irrelevant for solving Ax = b. If we 
were trying to solve x T A = b T ', we would need a good right inverse. 

Wilkinson noted that if rows of V are computed using LU with partial 
pivoting, then V is usually both a good left inverse and a good right inverse, 
but not always [16, page 113]. Du Croz and Higham show matrices for which 
this is not the case, but they also note that such matrices are the exception 
rather than the rule [4] . 

Other inversion methods tend to produce a matrix that is either a left 
inverse or a right inverse but not both. A good example is Newton's method. 
If one iterates with V® = (2I-V^~ 1 ^A)V^ t ~ 1 ^ then converges to a left 
inverse. If one iterates with V® = V ( f~ 1 \2I - AV^) then V® converges 
to a right inverse. 

Strassen's inversion formula [1, 14] sometimes produces an inverse that is 
neither a good left inverse nor a good right inverse [8, Section 26.3.2]. 
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5. Multiplication by the Inverse is (Sometimes) Backward 

Stable 

The next section presents a simple example in which the computed solu- 
tion xy is conditionally accurate but not backward stable. In this section 
we show that under certain conditions, the solution is also backward stable. 
The analysis also clarifies in what ways backward stability can be lost. 

Suppose that we use a V that is a good right inverse, \\AV — I\\ = 
^machine)- We can produce such a V by solving for its columns using 
a backward-stable solver. We have 

Ax v -b = A{V + A)b-b 

= (AV-I)b + AAb 

for some A with [|A[|/[|V|| = 0(e macn ine)- Here too, the A term does not 
influence the asymptotic upper bound. The assumption that V is a good 
right inverse bounds the other term, 

\\Ax v -b\\ < ||yiy-/||||6|| + p||||A||||6|| 

< 0(/€(A)e mac hine) ||b|| + O(emachinc) \\A\\ \\V\\ \\b\\ 

(5.1) = 0(K(A)e mi , chine )\\b\\ . 

The relative backward error is given by the expression || Axy— 6||/(||A|| ||xy || + 
[12]. Filling in the bound on the norm of the residual, we obtain 

\\Ax v -b\\ < \\Ax v -b\\ 



\A\\\\x v \\ + \\b\\ ~ \\A\\\\x v \\ 

I A 1 1 1 1 A II Cmachine | 



\A\ | Xy] 



= o 
o(\ A 

^ I ii ii fc machine 
V \\ X V 

If we assume that V is a reasonable enough left inverse so that at least 
\\xv\\ is close to \\x\\ (that is, if the forward error is 0(1)), then a solution 
x that has norm close to H^l^ 1 !! ||6|| guarantees backward stability to within 
O(emachine)- Let A = LT,R* be the SVD of A, so 

x = A^b 

= RY,- l L*b 
L*b. 



E 



where L« and Ri are the left and right singular vectors of A. If L* n b = 0(||6[|), 
then ||x|| > \L* n b\/a n = 0(||A _1 || ||6||) and xy is backward stable. If the 
projection of b on L n is not large but the projection on, say, is large 

and <7 n _i is close to a n , the solution is still backward stable, and so on. 

Perhaps more importantly, we have now identified the ways in which xy 
can fail to be backward stable: 
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(1) V is not a good right inverse, or 

(2) V is such a poor left inverse that \\xy\\ is much smaller than \\x\\, or 

(3) the projection of b on the left singular vectors of A associated with 
small singular values is small. 

The next section shows an example that satisfies the last condition. 

6. Numerical Examples 

Let us demonstrate the theory with a small numerical example. We set 
n = 256, u\ = 10 and a n = 10 -4 , generate a random matrix A with k(A) = 
10 s , and generate its inverse. The matrix and the inverse are produced 
by matrix multiplications, and each multiplication has at least one unitary 
factor, so both are accurate to within a relative error of about ^machine- 

We 

also compute an approximate inverse V using MATLAB's inv function. 
[L , dummy , R] = svd(randn(n) ) ; 

svalues = logspace(loglO(sigma_l) , loglO(sigma_n) , n) ; 

S = diag (svalues) ; 

invS = diag(svalues . ~-l) ; 

A = L * S * R' ; 

Accuratelnv = R * invS * L' ; 

V = inv (A) ; 

The approximate inverse V is only conditionally accurate, as predicted by 
(2.1), but its use as a left inverse leads to a conditionally small residual. 

Gamma = V - Accuratelnv; 

norm(Gamma) / norm(Accuratelnv) 
ans = 3.4891e-09 

norm(V * A - eye(n)) 
ans = 1.6976e-08 
We now generate a random right hand-side b and use the inverse to solve 
Ax = b. The result is backward stable to within a relative error of e mac hme- 

b = randn(n, 1) ; 

x = R * (invS * (L' * b)) ; 

X v = V * b; 

norm (A * xv - b) / (norm (A) * norm(xv) + norm(b)) 
ans = 8.8078e-16 

Obviously, the solution should be conditionally accurate, and it is. 
norm(xv - x) / norm(x) 
ans = 3. 102e-09 

We now perform a similar experiment, but with a random x, which leads 
to a right-hand side b which is nearly orthogonal to the left singular vectors 
of A that correspond to small singular values; now the solution is only 
conditionally backward stable. 

x = randn(n, 1) ; 

b = L * (S * (R> * x)) ; 
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xv = V * b; 

norm (A * xv - b) / (norm (A) * norm(xv) + norm(b)) 
ans = 2. 1352e-10 

Theorem 1 predicts that the solution should still be conditionally accu- 
rate. It is. Matlab's backslash operator, which is a linear solver based 
on Gaussian elimination with partial pivoting, produces a solution with a 
similar accuracy. 

norm((A\b) - x) / norm(x) 

ans = 4.0801e-09 
norm(xv - x) / norm(x) 
ans = 4.5699e-09 

The magic is in the special structure of rows of T. Figure 6.1 displays this 
structure graphically. We can see that a row of T is almost orthogonal to 
the left singular vectors of A associated with large singular values, and that 
the magnitude of the projections increases with decreasing singular values. 
If we produce an approximate inverse with the same magnitude of error 
as in inv(A) but with a random error matrix, it will not solve Ax = b 
conditionally accurately. 

Badlnv = Accuratelnv + norm (Gamma) * randn(n) ; 

xv = Badlnv * b; 

norm (A * xv - b) / (norm (A) * norm(xv) + norm(b)) 

ans = 0.075727 
norm(xv - x) / norm(x) 

ans = 0.83552 

7. Closing Remarks 

Solving a linear system of equations Ax = b using a computed inverse 
V produces a conditionally accurate solution, subject to an easy to satisfy 
condition on the computation of V. Using Gaussian elimination with partial 
pivoting or a QR factorization produces a solution with errors that have the 
same order of magnitude as those produced by V. 

If the right-hand side b does not have any special relationship to the left 
singular subspaces of A, then the solution produced by V is also backward 
stable (under a slightly different technical condition on V), and hence as 
good as a solution produced by GEPP or QR. As far as we know, this 
result is new. 

If b is close to orthogonal to the left singular subspaces of A corresponding 
to small singular values, then the solution produced by V is conditionally 
accurate, but usually not backward stable. Whether this is a significant 
defect or not depends on the application. In most applications, it is not a 
serious problem. 

One difficulty with a conditionally-accurate solution that is not backward 
stable is that it does not come with a certificate of conditional accuracy. We 
normally take a small backward error to be such a certificate. 
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Figure 6.1. The magnitude of the projections of three rows 
of r (the first, last, and middle, but all rows produce similar 
plots) on the left singular vectors of A, as a function of the 
corresponding singular values of A. 



There might be applications that require a backward stable solution rather 
than an accurate one. Strangely, this is exactly the case with the computa- 
tion of V itself; the analysis in this paper relies on rows being computed in 
a backward-stable way, not on their forward accuracy. We are not aware of 
other cases where this is important. 



References 

[1] David H. Bailey and Helaman R. P. Ferguson. A Strassen-Newton algorithm for 

high-speed parallelizable matrix inversion. In Proceedings of the 1988 ACM/IEEE 

conference on Supercomputing, pages 419-424, 1988. 
[2] S. D. Conte and Carl de Boor. Elementary Numerical Analysis: An Algorithmic 

Approach. McGraw-Hill Book Company, third edition, 1980. 
[3] Adi Ditkowski, Cadi Fibich, and Nir Gavish. Efficient solution of Ax^ = using 

A . Journal of Scientific Computing, 32:29-44, 2007. 
[4] Jeremy J. Du Croz and Nicholas J. Higham. Stability of methods for matrix inversion. 

IMA Journal of Numerical Analysis, 12(1):1-19, 1992. 
[5] Stefan Eberli, Davide Cescato, and Wolfgang Fichtner. Divide-and-conquer matrix 

inversion for linear MMSE detection in SDR MIMO receivers. In Proceedings of the 

26th Norchip Conference, pages 162-167. IEEE, 2008. 
[6] George E. Forsythe, Michael A. Malcolm, and Cleve B. Moler. Computer Methods for 

Mathematical Computations. Prentice-Hall, Inc., 1977. 
[7] Michael T. Heath. Scientific Computing: An Introductory Survey. The McGraw-Hill 

Companies, 1997. 

[8] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, second 
edition, 2002. 



How Accurate is inv(A)*b? 



9 



[9] The MathWorks, Inc. M-Lint: Matlab's code analyzer, 2010. Release 2010b. 
[10] Cleve Moler. Numerical Computing with MATLAB. SIAM, 2004. 
[11] Dianne P. O'Leary. Scientific Computing with Case Studies. SIAM, 2009. 
[12] J. L. Rigal and J. Caches. On the compatibility of a given solution with the data of 

a linear system. J. ACM, 14:543-548, July 1967. 
[13] G. W. Stewart. Matrix Algorithms. Volume I: Basic Decompositions. SIAM, 1998. 
[14] V. Strassen. Gaussian elimination is not optimal. Numererische Mathematik, 13:354- 

355, 1969. 

[15] C. Studer, S. Fateh, and D. Seethaler. ASIC implementation of soft-input soft-output 
MIMO detection using MMSE parallel interference cancellation. IEEE Journal of 
Solid-State Circuits, 46:1754-1765, 2011. 

[16] J. H. Wilkinson. Rounding Errors in Algebraic Processes. Prentice Hall, Inc., 1963. 



